Marketing Analytics Process

Predictive Modeling Workflow

Deep Learning

Deep learning is a type of machine learning that uses neural networks to essentially automate feature engineering by progressively extracting higher-level features from the data.

  • A long-standing goal of artificial intelligence research is to create a model that mimics the way the human mind works.
  • Neural networks are one result of these efforts (which have had mixed results), with hidden units (a.k.a. neurons) encoding feature engineering-like transformations of the data in order to predict the outcome.

Simple, right?!

A Neuron Model

  • \(x_1\) through \(x_3\) are the predictor inputs.
  • The output is the outcome variable \(y\).
  • The neuron combines the inputs with some weight for each and then transforms the combination before passing it on.
  • Together this combination function and transfer function are called an activation function.

Each neuron is a linear combination of the inputs (and then a transformation) based on a given activation function.
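
Written out for the three inputs above, a single neuron can be sketched as follows. The notation is an assumption for illustration: \(w_0\) is a bias (intercept), \(w_1\) through \(w_3\) are the weights, and \(g\) is the transfer function.

\(z = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3, \qquad \hat{y} = g(z)\)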

If we have a binary outcome variable and a single neuron with a logistic (sigmoid) activation function, this neural network is logistic regression. (Softmax generalizes the logistic function to outcomes with more than two classes.)
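
To make the connection to logistic regression concrete, here is a minimal sketch in base R. The data are simulated and the `neuron()` helper is hypothetical; the point is only that a weighted sum followed by a logistic transformation reproduces the predictions from `glm()`.

```r
# Simulated data: three predictors and a binary outcome (all values are made up).
set.seed(42)
n <- 200
x <- matrix(rnorm(n * 3), ncol = 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- rbinom(n, size = 1, prob = plogis(0.5 + x %*% c(1, -2, 0.5)))
sim_data <- data.frame(y = y, x)

# Fit an ordinary logistic regression to get a bias and three weights.
logit_fit <- glm(y ~ x1 + x2 + x3, data = sim_data, family = binomial())
bias <- coef(logit_fit)[1]
weights <- coef(logit_fit)[-1]

# A single neuron: combination function (weighted sum), then transfer function.
neuron <- function(inputs, weights, bias) {
  combination <- as.vector(inputs %*% weights + bias)  # linear combination
  plogis(combination)                                  # logistic transformation
}

neuron_pred <- neuron(x, weights, bias)
glm_pred <- unname(predict(logit_fit, type = "response"))

# Check that the neuron reproduces the logistic regression predictions.
all.equal(neuron_pred, glm_pred)
```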

Neural Networks

What makes it a neural network is including multiple neurons (i.e., hidden units) and potentially multiple hidden layers (hidden just means they aren’t inputs or outputs).

  • Each hidden unit has its own activation function.
  • Each hidden layer takes the outputs of the previous layer as inputs.

With enough hidden units, this sequence of activation functions can approximate a very wide range of functions, essentially learning the feature engineering that results in better predictions.
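
As a rough sketch of how hidden units feed into an output layer, the base R code below runs one forward pass through a network with three inputs, two hidden units, and three outcome classes. All weights are arbitrary random numbers and the function names are hypothetical; a real network learns the weights during training.

```r
# Forward pass through one hidden layer; all weights are arbitrary illustrations.
logistic <- function(z) 1 / (1 + exp(-z))
softmax <- function(z) exp(z - max(z)) / sum(exp(z - max(z)))

forward_pass <- function(x, W_hidden, b_hidden, W_out, b_out) {
  hidden <- logistic(as.vector(W_hidden %*% x) + b_hidden)  # hidden unit outputs
  scores <- as.vector(W_out %*% hidden) + b_out             # one score per class
  softmax(scores)                                           # class probabilities
}

set.seed(42)
x <- c(0.2, -1.3, 0.7)                      # three predictor inputs
W_hidden <- matrix(rnorm(2 * 3), nrow = 2)  # 2 hidden units x 3 inputs
b_hidden <- rnorm(2)
W_out <- matrix(rnorm(3 * 2), nrow = 3)     # 3 classes x 2 hidden units
b_out <- rnorm(3)

probs <- forward_pass(x, W_hidden, b_hidden, W_out, b_out)
probs       # one probability per class
sum(probs)  # the probabilities sum to 1
```

Each hidden unit is itself a neuron like the one described earlier; the output layer simply treats the hidden unit outputs as its inputs.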

Ready to Train a Neural Network?

Install the nnet package!

Preliminaries

Load libraries and set a random seed.

# Load packages, set seed.
library(tidyverse)
library(tidymodels)
set.seed(42)

Data Wrangling

# Import data, wrangle S1 into segment, coerce factors, and select predictors.
roomba_survey <- read_csv(here::here("Data", "roomba_survey.csv")) |> 
  rename(segment = S1) |> 
  mutate(
    segment = case_when(
      segment == 1 ~ "own",
      segment == 3 ~ "shopping",
      segment == 4 ~ "considering"
    ),
    segment = factor(segment),
    D2HomeType = factor(D2HomeType),
    D3Neighborhood = factor(D3Neighborhood),
    D4MaritalStatus = factor(D4MaritalStatus),
    D6Education = factor(D6Education)
  ) |> 
  select(
    segment, contains("RelatedBehaviors"), contains("ShoppingAttitudes"), 
    D1Gender, D2HomeType, D3Neighborhood, D4MaritalStatus, D6Education
  )

Feature Engineering

Normalizing all of the predictors is common for neural networks. Why?

  • Improved training speed.
  • Stabilization of model training.
  • Prevents one feature from dominating the model.

\(X^{norm} = \frac{X - \mu_X}{\sigma_X}\)
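
The formula above can be sketched by hand in base R. The numbers are hypothetical; the detail worth noticing is that the mean and standard deviation are estimated from the training data only and then reused for the testing data.

```r
# Hypothetical training and testing values for a single predictor.
train_x <- c(2, 4, 4, 5, 7, 9, 11)
test_x <- c(3, 8)

# Estimate the mean and standard deviation from the training data only.
mu <- mean(train_x)
sigma <- sd(train_x)

# Apply the same transformation to both sets.
train_norm <- (train_x - mu) / sigma
test_norm <- (test_x - mu) / sigma

mean(train_norm)  # approximately 0
sd(train_norm)    # 1 (up to rounding)
```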

# Split data based on segment.
roomba_split <- initial_split(roomba_survey, prop = 0.75, strata = segment)

# Feature engineering.
roomba_recipe <- training(roomba_split) |>
  recipe(
    segment ~ .
  ) |>
  step_dummy(all_nominal_predictors()) |> 
  step_zv(all_predictors()) |>
  step_normalize(all_predictors())

Fit a Multilayer Perceptron

There are many different types of neural networks. A common one is the multilayer perceptron. The version we fit here, using the nnet engine, has a single hidden layer.

# Set the model, engine, and mode.
nn_model_01 <- mlp() |> 
  set_engine(engine = "nnet") |> 
  set_mode("classification")

nn_model_01
## Single Layer Neural Network Model Specification (classification)
## 
## Computational engine: nnet

Create a workflow that combines the recipe and model and fit it.

# Create a workflow.
nn_wf_01 <- workflow() |> 
  add_recipe(roomba_recipe) |> 
  add_model(nn_model_01)

# Fit the workflow.
nn_fit_01 <- fit(nn_wf_01, data = training(roomba_split))

What’s Our Next Step?

Model Evaluation

So far, we have repeated the same steps every time we want to evaluate a model.

  • Generate predictions from the fitted model for the test data
  • Join the predictions as a new column in the test data
  • Compute the accuracy using the prediction column and the column of true values

What coding tool can we use to simplify this process a little bit?

Functions Redux

  • We should try to write functions that are as general as possible.
  • To create a tidy function, i.e., a function that accepts an unquoted variable name as an argument, place {{ }} around that variable name wherever it occurs in the function.

fit_accuracy <- function(fit, testing_data, truth) {
  # Compute test-set accuracy for a fitted model.
  # - fit: a fitted model or workflow
  # - testing_data: test data for which predictions will be generated
  # - truth: unquoted name of the target (Y) variable contained in testing_data
  fit |> 
    predict(new_data = testing_data) |>
    bind_cols(testing_data) |>
    accuracy(truth = {{truth}}, estimate = .pred_class)
}
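
For a smaller illustration of the same {{ }} pattern, here is a sketch using the built-in mtcars data. The `group_mean()` helper is hypothetical and exists only to show unquoted column names being passed through a function.

```r
library(dplyr)

# Hypothetical helper: average of any column, grouped by any other column.
group_mean <- function(data, group, value) {
  data |>
    group_by({{ group }}) |>
    summarize(avg = mean({{ value }}), .groups = "drop")
}

# Column names are passed unquoted, just like in dplyr verbs.
cyl_mpg <- group_mean(mtcars, cyl, mpg)
cyl_mpg
```

Without {{ }}, the function would look for columns literally named group and value instead of the ones supplied by the caller.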

Evaluate Predictive Fit

Let’s look at the accuracy on the testing data.

# Compute model accuracy.
fit_accuracy(nn_fit_01, testing(roomba_split), segment)
## # A tibble: 1 × 3
##   .metric  .estimator .estimate
##   <chr>    <chr>          <dbl>
## 1 accuracy multiclass     0.595
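
To see what the accuracy metric is measuring, here is a minimal base R sketch with made-up predictions for eight respondents. Accuracy is the share of predictions that match the truth, and a confusion matrix shows which segments are being mixed up.

```r
# Hypothetical true and predicted segments for eight respondents.
segments <- c("considering", "own", "shopping")
truth <- factor(c("own", "own", "shopping", "considering",
                  "own", "shopping", "considering", "own"), levels = segments)
pred <- factor(c("own", "shopping", "shopping", "considering",
                 "own", "own", "considering", "own"), levels = segments)

# Accuracy: proportion of predictions that match the true segment.
acc <- mean(pred == truth)
acc  # 0.75 (6 of 8 correct)

# Confusion matrix: rows are predictions, columns are true segments.
conf <- table(predicted = pred, actual = truth)
conf
```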

Hyperparameter Tuning

There are a lot of potential hyperparameters to tune, but we’ll focus on three.

  • hidden_units: the number of hidden units in the single hidden layer
  • epochs: the number of training iterations
  • penalty: a penalty term to help fight overfitting
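
One common form for such a penalty is weight decay, where the sum of squared weights is added to the loss being minimized. The symbols here are assumptions for illustration: \(L(w)\) is the unpenalized loss, \(w_j\) are the network weights, and \(\lambda\) is the penalty.

\(L_{penalized}(w) = L(w) + \lambda \sum_j w_j^2\)

Larger values of \(\lambda\) shrink the weights toward zero, which limits how closely the network can chase noise in the training data.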

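The activation function for the output layer is set based on the model mode: "linear" for regression and "softmax" for classification. The activation argument of mlp() controls the hidden layer instead; with engines that support it, you can experiment with options such as "relu".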

How Do We Tune Ourselves Spiritually?

“Teaching, exhorting, and explaining - as important as they are - can never convey to an investigator, a child, a student, or a member a witness of the truthfulness of the restored gospel. Only as their faith initiates action and opens the pathway to the heart can the Holy Ghost deliver confirming witnesses. Missionaries, parents, teachers, and leaders obviously must learn to teach by the power of the Spirit. Of equal importance, however, is the responsibility they have to help others learn for themselves by faith.”

Elder Bednar, “Learning in the Lord’s Way”, October 2018.

# Use v-fold cross-validation based on segment.
roomba_cv <- vfold_cv(training(roomba_split), v = 10, strata = segment)

# Set the model, engine, and mode.
nn_model_02 <- mlp(hidden_units = tune(), epochs = tune(), penalty = tune()) |> 
  set_engine(engine = "nnet") |> 
  set_mode("classification")

# Update the workflow.
nn_wf_02 <- nn_wf_01 |> 
  update_model(nn_model_02)

# Tune the hyperparameters by using the cross-validation.
nn_tune <- nn_wf_02 |> 
  tune_grid(resamples = roomba_cv)

nn_tune |>
  collect_metrics(summarize = FALSE) |>
  filter(.metric == "accuracy") |>
  group_by(.config) |>
  summarize(avg_accuracy = mean(.estimate)) |>
  arrange(desc(avg_accuracy))
## # A tibble: 10 × 2
##    .config               avg_accuracy
##    <chr>                        <dbl>
##  1 Preprocessor1_Model02        0.620
##  2 Preprocessor1_Model04        0.617
##  3 Preprocessor1_Model10        0.609
##  4 Preprocessor1_Model08        0.593
##  5 Preprocessor1_Model05        0.585
##  6 Preprocessor1_Model07        0.580
##  7 Preprocessor1_Model06        0.571
##  8 Preprocessor1_Model09        0.561
##  9 Preprocessor1_Model01        0.560
## 10 Preprocessor1_Model03        0.557
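
The tuning above let tune_grid() choose its own candidate values, which is where the ten configurations come from. As an alternative, here is a sketch of building an explicit grid with the dials package. The ranges are illustrative assumptions, not recommendations, and the commented lines assume the workflow and folds defined earlier.

```r
library(dials)

# Three candidate values per hyperparameter; the ranges are illustrative assumptions.
nn_grid <- grid_regular(
  hidden_units(range = c(1L, 10L)),
  penalty(range = c(-4, 0)),        # on the log10 scale: 0.0001 to 1
  epochs(range = c(100L, 1000L)),
  levels = 3
)

nrow(nn_grid)  # 3 x 3 x 3 = 27 combinations

# Assuming the workflow and folds defined earlier, the grid could be passed along:
# nn_tune <- nn_wf_02 |>
#   tune_grid(resamples = roomba_cv, grid = nn_grid)
```

A regular grid grows quickly as levels are added, so a coarse grid followed by a finer one around the best region is often a reasonable compromise.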

Select the best model and finalize our workflow.

# Select the best fitting model.
nn_tune_best <- nn_tune |> 
  select_best(metric = "accuracy")

# Finalize the workflow.
nn_wf_final <- nn_wf_02 |> 
  finalize_workflow(nn_tune_best)

nn_wf_final
## ══ Workflow ════════════════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: mlp()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 3 Recipe Steps
## 
## • step_dummy()
## • step_zv()
## • step_normalize()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Single Layer Neural Network Model Specification (classification)
## 
## Main Arguments:
##   hidden_units = 2
##   penalty = 0.0110234294006162
##   epochs = 891
## 
## Computational engine: nnet

Now let’s fit on the entire training data.

# Fit the tuned workflow to the full training data.
nn_fit_02 <- fit(nn_wf_final, data = training(roomba_split))

# Compute model accuracy.
fit_accuracy(nn_fit_02, testing(roomba_split), segment)
## # A tibble: 1 × 3
##   .metric  .estimator .estimate
##   <chr>    <chr>          <dbl>
## 1 accuracy multiclass     0.631

Final Thoughts

Neural networks can be powerful prediction engines. However, they often require a lot of data and can be slow to train. As with every predictive model, there is no guarantee it will predict best for a given application.

While adding additional hidden units can help uncover finer structures in the data, adding additional hidden layers allows for increasingly nonlinear relationships between the predictors and the outcome.

However, additional hidden layers require an engine other than nnet, which fits only a single hidden layer.

Wrapping Up

Summary

  • Introduced deep learning and neural networks.
  • Demonstrated running a multilayer perceptron, including hyperparameter tuning.

Next Time

  • Ensembles.

Supplementary Material

Exercise 18

Return to the set of models from the previous two exercises.

  1. Set up workflows for each of the three model types (decision tree, random forest, neural network). Replicate the feature engineering we did in class and use the same recipe (create dummies, remove constant features, standardize features) for all models.
  2. Train and tune each model using cross-validation on the training data (no cross-validation required for the decision tree).
  3. Create a data frame comparing the performance of all three models on the test data. Which is the best-fitting model so far? Use confusion matrices to explore how the best-fitting model is doing better than the competing models. Interpret what you find. For example, is one of the models notably better at predicting one or more of the segments of interest?
  4. Render the Quarto document into Word and upload to Canvas.